The least-mean-square (LMS) algorithm is the most often used real-time
adaptive filtering algorithm due to its computational simplicity and remarkably
good fit to the optimal Wiener solution. Many transform domain algorithms have
been proposed for improving the convergence rate of the LMS algorithm, the
most popular of which has been the Discrete Fourier Transform (DFT). However,
the DFT requires complex arithmetic, and thus has proven computationally
undesirable for applications involving only real signals. A number of unitary, real
transforms have been proposed as less costly replacements for the DFT. These
include the Discrete Cosine Transform (DCT), the Discrete Walsh-Hadamard
Transform (WHT), and the Power-of-Two Transform (P02). Each of these in some
way exhibits a property necessary to speed the convergence rate, at a lower
computational cost than the DFT. This work investigates the use of another real
transform, the Discrete Hartley Transform (DHT), for adaptive system
estimation and adaptive echo cancelling. It is shown that the DHT performs
better than these other real transforms under most circumstances. Its
relationship to the DFT is such that it can be transformed into the DFT with
simple algebraic manipulation.
CHAPTER 1
INTRODUCTION 


Lately, research has been conducted on the use of orthogonal transforms to
speed the convergence rate of adaptive systems. Work has been done to examine
the usefulness of certain well-known transforms for this purpose [7,8], and new
orthogonal transforms have been designed to operate effectively under a given set
of circumstances [9]. When the system is required to adapt under many different
signalling conditions, it becomes difficult to select a transform that works well
under all circumstances. There is a need to compare many of these established
transforms to one another and to rank them according to their ability to provide
improved performance in adaptive filters. This research provides such a
comparison for most of the known transforms used for adaptive filtering: the
Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT), and
Walsh-Hadamard Transform (WHT). It also compares the less widely known
Power-of-Two Transform (P02) and introduces the Discrete Hartley Transform
(DHT) to the field. The Hartley Transform will be shown to improve adaptive
behavior more consistently than the other transforms, at an equal or lower
computational cost.

1.1 Past Work and Intent

A considerable amount of the preliminary work in transform domain adaptive
filtering studies was done here at the University of Illinois by D. F. Marshall and
J. R. Kreidle [7,9,17]. Their work was based on a combination of the LMS
algorithm developed by Widrow and Hoff [2] and the transform domain LMS
algorithm developed by Narayan et al. [8]. The objective of this research is to
improve upon their results and to present a hierarchical ordering of transform
performance in FIR adaptive filters, as determined from a series of computer
experiments in which these transforms were used to speed the convergence rate
of the adaptive LMS algorithm. Considered in these experiments are the Discrete
Fourier, Discrete Cosine, Discrete Walsh-Hadamard, and Power-of-Two
Transforms. Several authors have shown the usefulness of these suboptimal
transforms for various inputs [7,8,9]. This research will attempt to rank these
transforms and will, in addition, introduce another suboptimal transform, the
Discrete Hartley Transform, that will be shown to perform better under most
conditions than any of the above.

It is hoped that the transform domain adaptive filtering algorithms
discussed in this work will find practical application in many telecommunications
problems. One possible application occurs in a telephone channel carrying voice or
data signals with the need for echo cancelling. The channel will be modelled as a
low-pass system, which the adaptive filter will try to match when presented
with various colored noise inputs. It is important that the echo canceller
converge in the presence of many different signalling conditions, since modern
telecommunications networks are used for more than simple voice transmission,
and these other signals (data, tones, etc.) sometimes make convergence difficult
in the time domain. The situation is improved by using the transformed inputs,
but results are not uniform with different transforms. A second criterion is
that the method used to speed convergence must fit within the practical
constraints of modern VLSI technology for size and power consumption. The
interest in the real transforms stems from this stipulation, because the FFT
algorithms that are known require more space and power than some situations
will allow. While this is the application in mind throughout these studies, it is
not claimed that the model used accurately represents a channel of a telephone
system; it is merely an easily controllable test case that serves to study the
performance of these transform domain algorithms.


CHAPTER  2 


TRANSFORM  DOMAIN  ADAPTIVE  FILTERING 

2.1 Background: The LMS Algorithm

The  tapped-delay-line  FIR  adaptive  filter  that  forms  the  basis  of  this  work 
is  shown  in  Figure  2.1.  The  input  signal  vector  is  defined  as 

$$X(n) = [\,x(n)\;\; x(n-1)\;\; \cdots\;\; x(n-N+1)\,]'$$

and the weight vector is

$$A(n) = [\,a_0(n)\;\; a_1(n)\;\; \cdots\;\; a_{N-1}(n)\,]'$$

where  '  denotes  transpose.  The  input  X(n)  is  assumed  to  be  a  sequence  of 
zero-mean,  stationary  random  variables  which  are  not  necessarily  uncorrelated. 
The  filter  output  is  a  scalar  y(n)  given  by 

y(n)=X(n)'A(n). 

To generate the error needed for adaptation, the desired response d(n), a scalar, is
compared to the output y(n) to give e(n)=d(n)-y(n). The error is used by the LMS
algorithm to update the weight vector A(n) with each sample according to the
expression

$$A(n+1) = A(n) + 2\mu\, X(n)\, e(n)$$

where $\mu$ is a constant called the step size, which is usually determined
experimentally.

Figure 2.1 The tapped delay line adaptive filter

At convergence, when $\mu$ is properly chosen, the expected value of the weight
vector $A(\infty)$ is the Wiener solution

$$E[A(\infty)] = R_{xx}(i,j)^{-1}\, R_{xd}(i,j), \qquad i,j = 0, 1, \ldots, N-1.$$

The  Wiener  solution  describes  the  optimal  solution  (solution  of  least  mean  square 
error).  The  main  attraction  of  the  LMS  algorithm  is  its  simplicity,  which  comes 
from  the  fact  that  it  uses  an  estimate  of  the  gradient  rather  than  the  true 
gradient.  Any  misadjustment  of  the  solution  can  be  attributed  to  this  gradient 
estimate. 
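
The LMS recursion described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the FORTRAN code used in this work: the function name `lms_identify`, the four-tap example system, the step size, and the random seed are all assumptions chosen for the example.

```python
import random

def lms_identify(x, d, n_taps, mu):
    """Time-domain LMS: A(n+1) = A(n) + 2*mu*X(n)*e(n).

    x, d   : input and desired-response sequences (lists of floats)
    n_taps : filter length N
    mu     : step size (in practice chosen experimentally)
    Returns the final weight vector and the squared-error history.
    """
    weights = [0.0] * n_taps
    delay_line = [0.0] * n_taps      # X(n) = [x(n), x(n-1), ..., x(n-N+1)]
    sq_err = []
    for xn, dn in zip(x, d):
        delay_line = [xn] + delay_line[:-1]
        y = sum(w * v for w, v in zip(weights, delay_line))   # y(n) = X(n)'A(n)
        e = dn - y                                            # e(n) = d(n) - y(n)
        weights = [w + 2.0 * mu * v * e for w, v in zip(weights, delay_line)]
        sq_err.append(e * e)
    return weights, sq_err

# Hypothetical 4-tap "unknown" system driven by zero-mean white noise.
rng = random.Random(0)
system = [0.5, -0.3, 0.2, 0.1]
x = [rng.random() - 0.5 for _ in range(5000)]
d = [sum(h * x[n - k] for k, h in enumerate(system) if n - k >= 0)
     for n in range(len(x))]
weights, sq_err = lms_identify(x, d, n_taps=4, mu=0.5)
```

With a white input and a noise-free desired response, as assumed here, the adapted weights approach the coefficients of the example system; a correlated input would slow this down, which is the motivation for the transform domain methods.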

It can be seen that the LMS algorithm converges very quickly if the input is
uncorrelated and stationary and the unknown system is a linear FIR system. If we
relax one or more of these conditions, however, the algorithm will not converge as
rapidly to the Wiener solution. In particular, if the stipulation of uncorrelated
input signal is relaxed, as will happen under all practical signalling conditions,
transform domain techniques appear to be useful to retain the simplicity of the LMS
algorithm, while elevating the performance to the level of more complex methods.
To process the input data in a transform domain, it must first be multiplied
by a unitary matrix W, the transform matrix, which contains orthogonal rows and
columns, to form a new input vector Z(n) = W X(n). Figure 2.2 shows the adaptive
filter with the transform matrix in place.
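
A minimal sketch of such a transform domain filter follows, assuming the orthonormal Discrete Hartley matrix as W and a per-component step size scaled by a running power estimate. The smoothing constant `beta`, the initial power value, the two-tap coloring filter, and the four-tap example system are assumptions made for illustration and are not taken from the experiments in this work.

```python
import math
import random

def dht_matrix(n):
    """Orthonormal DHT matrix: W[k][m] = cas(2*pi*k*m/n) / sqrt(n)."""
    scale = 1.0 / math.sqrt(n)
    return [[scale * (math.cos(2.0 * math.pi * k * m / n) +
                      math.sin(2.0 * math.pi * k * m / n))
             for m in range(n)] for k in range(n)]

def transform_lms(x, d, w_matrix, mu, beta=0.99):
    """LMS on Z(n) = W X(n) with power-normalized step sizes (assumed form)."""
    n_taps = len(w_matrix)
    weights = [0.0] * n_taps     # transform domain weights
    power = [1.0] * n_taps       # running estimate of E[z_i(n)^2]
    delay_line = [0.0] * n_taps
    sq_err = []
    for xn, dn in zip(x, d):
        delay_line = [xn] + delay_line[:-1]
        z = [sum(w_matrix[i][j] * delay_line[j] for j in range(n_taps))
             for i in range(n_taps)]
        e = dn - sum(b * zi for b, zi in zip(weights, z))
        for i in range(n_taps):
            power[i] = beta * power[i] + (1.0 - beta) * z[i] * z[i]
            weights[i] += 2.0 * mu * e * z[i] / power[i]
        sq_err.append(e * e)
    return weights, sq_err

def fir(taps, signal):
    """Causal FIR filtering with zero initial conditions."""
    return [sum(h * signal[n - k] for k, h in enumerate(taps) if n - k >= 0)
            for n in range(len(signal))]

rng = random.Random(1)
white = [rng.random() - 0.5 for _ in range(20000)]
colored = fir([1.0, 0.9], white)          # hypothetical low-pass coloring
system = [0.5, -0.3, 0.2, 0.1]            # hypothetical unknown system
desired = fir(system, colored)
w_matrix = dht_matrix(4)
b, sq_err = transform_lms(colored, desired, w_matrix, mu=0.02)
# Equivalent time-domain weights: A = W' B, since y(n) = Z(n)'B = X(n)'W'B.
equivalent = [sum(w_matrix[i][j] * b[i] for i in range(4)) for j in range(4)]
```

Because W is orthonormal, the transform domain weights map back to time-domain coefficients through W', which makes it easy to check the adapted filter against the example system.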

2.2  Interpretation 

The desired effect of transforming the input vector is to partially "whiten"
the signal in an attempt to increase the efficiency of the LMS algorithm. A
"white" signal, also called a decorrelated signal, is one whose correlation matrix
Rxx is diagonal, with all diagonal elements equal to one. The transform brings
about this whitening by partially diagonalizing the input correlation matrix, and
then equalizing the values along the diagonal (the eigenvalues of the matrix) to
some extent by normalizing them with a power factor $\sigma_i^2 = E[\,|x_i(n)|^2\,]$
(Equation (2.1)) [17]. An attempt has been made to model the orthogonalization
process as the action of a bank of band-pass filters, with each orthogonal basis
function representing the center frequency of the associated filter [7]. Thus for
an eight-tap system, as was used in the following simulations, there are eight
equally spaced bands around the normalized frequency axis. Each filter has
associated with it one point at which the input is completely decorrelated, and
many points at which varying amounts of correlation remain. It is this partial
nature of the decorrelation that makes transform techniques less than ideal. Only
one transform, the Karhunen-Loeve, is able to completely decorrelate any input.
Other transforms succeed to varying degrees.

Figure 2.2 An FIR adaptive filter using an orthogonal transform to improve
convergence

An attempt was made to describe this decorrelation, and to find a way to measure
a transform's effectiveness. The literature was surveyed, and a measure was found
for the decorrelating ability of a unitary transform for a class of inputs
described as Markov-1 processes. A Markov-1 process is an autoregressive process
whose statistics are first order. The decorrelation measure was well defined for
this type of process, but was not defined for any other order of input statistics.
Perhaps this is so because the first order processes are much simpler than any
other. An attempt was made to use the decorrelation measure for other types of
processes. The measure was found to be invalid for processes other than first
order Markov. A further attempt was made to create a similar measure for other
processes. This was also unsuccessful. The analysis was left at this point in
favor of an experimental study of the transforms' effectiveness, which comprises
the body of this work. The design of a useful measure of decorrelation remains a
valid area of research for the future [18,19].
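
The idea of partial diagonalization can be illustrated numerically for a Markov-1 input. The sketch below assumes a correlation matrix with entries rho^|i-j| and uses the fraction of squared matrix entries lying off the diagonal as a crude indicator. This indicator is an illustrative assumption only; it is not the decorrelation measure surveyed in the literature, and the value rho = 0.9 is arbitrary.

```python
import math

def dht_matrix(n):
    """Orthonormal DHT matrix: W[k][m] = cas(2*pi*k*m/n) / sqrt(n)."""
    scale = 1.0 / math.sqrt(n)
    return [[scale * (math.cos(2.0 * math.pi * k * m / n) +
                      math.sin(2.0 * math.pi * k * m / n))
             for m in range(n)] for k in range(n)]

def markov1_correlation(n, rho):
    """Correlation matrix of a first-order Markov process: R(i,j) = rho^|i-j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two square lists-of-lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def off_diagonal_fraction(m):
    """Share of the squared entries of m that lie off the main diagonal."""
    total = sum(v * v for row in m for v in row)
    diag = sum(m[i][i] ** 2 for i in range(len(m)))
    return (total - diag) / total

n, rho = 8, 0.9
r_xx = markov1_correlation(n, rho)
w = dht_matrix(n)
r_zz = matmul(matmul(w, r_xx), transpose(w))   # correlation of Z(n) = W X(n)
before = off_diagonal_fraction(r_xx)
after = off_diagonal_fraction(r_zz)
```

For this example most of the off-diagonal energy is removed by the transform, but not all of it, which is the "partial" decorrelation discussed above.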

2.3  The  Karhunen-Loeve  Transform 


An optimal orthogonal transform for adaptive filtering must be one that
will, for any input vector, result in completely uncorrelated outputs, with equal
power in each sample (in other words, white noise). This is necessary because
the LMS algorithm will only converge quickly if the input is in this form. The
only transform capable of completely decorrelating any input vector is the
well-known Karhunen-Loeve Transform. The transform matrix contains elements which
are functions of the input autocorrelation matrix, so some specific knowledge of
the input is needed to implement this method. The problem is that it is very
difficult to update these input statistics on line in real time. An estimate of the
autocorrelation can be made as the data are input to the system, but this will add
another time-varying filter. Thus, it is of interest to find suboptimal
transforms with fixed statistics that can perform reasonably well on a wide
variety of input signals.
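
In symbols, the dependence of the Karhunen-Loeve Transform on the input statistics can be sketched as follows. The notation Q and Λ is introduced here for illustration and is not taken from the source; the statement is the standard eigendecomposition of a symmetric correlation matrix.

```latex
% Assumed notation: Q holds the orthonormal eigenvectors of R_xx in its
% columns, and \Lambda = diag(\lambda_0, ..., \lambda_{N-1}) its eigenvalues.
R_{xx} = E[X(n)X(n)'] = Q \Lambda Q', \qquad Q'Q = I
% Choosing W = Q' as the transform gives
Z(n) = Q' X(n), \qquad
R_{zz} = E[Z(n)Z(n)'] = Q' R_{xx} Q = \Lambda
% and scaling each component by \lambda_i^{-1/2} (power normalization)
% leaves an identity correlation matrix, i.e. a white input.
\Lambda^{-1/2} R_{zz} \Lambda^{-1/2} = I
```

Since Q is computed from R_xx, a change in the input statistics changes the transform itself, which is why a fixed transform is preferred here.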


CHAPTER  3 


THE  TRANSFORMS 


3.1  A  Complex  Transform:  The  Discrete  Fourier  Transform 

The DFT is a familiar example of an orthogonal transform. If no specific
type of input is assumed, then the DFT decomposes the input into components
which are not completely uncorrelated [10]. If, on the other hand, the input is
periodic, the Fourier coefficients will be uncorrelated. For the DFT, the disjoint
filter interpretation provides a useful insight. For an eight-point DFT, there are
eight points in the Fourier domain where there is zero overlap of the band-pass
filters that are spaced equally around the unit circle in the z-plane. This means
there are only eight frequencies where the process is truly decorrelated.

If the input were periodic with frequency content only at these eight points,
the DFT would produce an uncorrelated output. The autocorrelation matrix of a
periodic input is given by the diagonal matrix

$$R_{zz}(i,j) = \mathrm{diag}\,[\,r_{zz}(i,i)\,], \qquad i = 0, \ldots, N-1 \tag{3.1}$$

where the $r_{zz}(i,i)$ are the diagonal elements of the matrix.

The average power in a sequence is defined by $\sigma_z^2 = r_{zz}(i,i)$. Therefore, if
the eight components in the input are of different average powers, then Equation
(3.1) is a diagonal matrix with unequal elements. To finish the "whitening" of
this case, it is simply necessary to multiply each diagonal element by the inverse
of the average power. This is the power normalization of Equation (2.1).

The above is a very special case, where the input has only N frequency values
for an N-point transform. For other inputs the degree of decorrelation can be
estimated approximately by using the disjoint filter interpretation [7]. The
decorrelation in the general case will be incomplete, so a system driven by a
partially decorrelated input should not be expected to converge as quickly as one
whose input is white. However, the transform domain algorithm should converge
more quickly than the time domain algorithm with the same input.

3.2  The  Real  Transforms 

Real orthogonal transforms differ from the DFT by the fact that they
require only real multiplications and additions to calculate an output from a real
input, while the DFT will require complex mathematics even with a real input.
The four real transforms that are used are the Discrete Cosine Transform (DCT),
Discrete Hartley Transform (DHT), Discrete Walsh-Hadamard Transform (WHT), and
Power-of-Two Transform (P02).

3.2.1  The  discrete  cosine  transform 

The  DCT  is  defined  by 
where k = 0, ..., N-1 and

$$c(k) = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & k = 1, \ldots, N-1 \\ 0, & \text{elsewhere.} \end{cases}$$


A major disadvantage of the DCT is that it does not have an FFT-like fast
algorithm, although fast algorithms have been derived for a large set of specific
inputs [11]. So for each value of N, the fast algorithm for the DCT changes,
rather than being an extension of a general form. This can make the transform
less desirable under the most general of circumstances, that is, when the input is
not only of unknown type, but the order of the system is not fixed either.
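
As an illustration of how the weighting c(k) produces an orthogonal transform, the sketch below builds a DCT matrix and checks that its rows are orthonormal. It assumes the common DCT-II kernel cos((2m+1)kπ/2N) with a sqrt(2/N) scale factor; the scaling used in the definition of this section may differ, so this should be read as one plausible form rather than the exact one used here.

```python
import math

def c(k, n):
    """Weighting c(k): 1/sqrt(2) for k = 0, 1 for k = 1..N-1, 0 elsewhere."""
    if k == 0:
        return 1.0 / math.sqrt(2.0)
    if 1 <= k <= n - 1:
        return 1.0
    return 0.0

def dct_matrix(n):
    """Assumed DCT-II form: C[k][m] = sqrt(2/N) c(k) cos((2m+1) k pi / (2N))."""
    scale = math.sqrt(2.0 / n)
    return [[scale * c(k, n) * math.cos((2 * m + 1) * k * math.pi / (2.0 * n))
             for m in range(n)] for k in range(n)]

def max_orthonormality_error(matrix):
    """Largest deviation of (matrix * matrix') from the identity."""
    n = len(matrix)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(matrix[i][m] * matrix[j][m] for m in range(n))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(dot - target))
    return worst

dct8 = dct_matrix(8)
error = max_orthonormality_error(dct8)
```

The 1/sqrt(2) weight on the k = 0 row is what brings the constant basis vector to unit length; without it the matrix would be orthogonal but not orthonormal.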


3.2.2 The discrete Walsh-Hadamard transform


From the point of view of computation, the WHT is the most desirable. The basis
functions consist of sampled Walsh functions, which are an orthogonal set of
piecewise constant binary valued functions. The first six Walsh functions and
the matrix for the transform with N=8 are shown in Figures 3.1a and 3.1b,
respectively.

Even if the WHT were not a strongly performing transform, the simplicity
of its calculation would make it appealing in many applications where
computation is more critical, and only some convergence improvement is needed
over the time domain. Calculation of the fast algorithm requires no
multiplications, as all matrix elements are ±1, and only N log2 N additions. The
fast algorithm has the same form as the FFT, and can in fact be calculated from
the FFT by making all variables real, and setting all the complex exponentials
equal to ±1.
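
The butterfly structure and the operation count can be sketched as follows. This is an illustrative, unnormalized fast Walsh-Hadamard transform in natural (Hadamard) ordering; the ordering of the rows in Figure 3.1b may be different (for example sequency ordering), and the function name `fwht` is an assumption.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (Hadamard ordering).

    Returns the transformed list and the number of additions/subtractions
    performed, which equals N*log2(N) for a length-N input.
    """
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    additions = 0
    span = 1
    while span < n:
        for start in range(0, n, 2 * span):
            for i in range(start, start + span):
                a, b = data[i], data[i + span]
                data[i], data[i + span] = a + b, a - b   # butterfly with +/-1 only
                additions += 2
        span *= 2
    return data, additions

spectrum, count = fwht([1, 2, 3, 4, 5, 6, 7, 8])
# Applying the transform twice returns N times the original sequence.
twice, _ = fwht(spectrum)
```

No multiplications appear anywhere in the loop, and for N = 8 the count comes to 24 additions and subtractions, in line with the N log2 N figure quoted above.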


3.2.3  Power-of-two  transform 


The  P02  is  a  transform  custom-designed  at  the  University  of  Illinois  to  be 
useful  in  transform  adaptive  digital  filtering,  while  keeping  a  simple 
computational  form.  It  is  called  the  Power-of-Two  Transform  because  the 
elements  of  the  transform  matrix  were  required  not  only  to  be  real,  but  also  to 
be  simple  powers-of-2.  This  implies  that  in  practice  the  P02  can  be 
implemented  entirely  without  multiplication,  i.e.,  it  requires  only  shifting  and 
adding  [9]. 

The transform matrix for the P02 with N=8 is shown in Figure 3.2. With
this length transform, only four power-of-2 levels are used. It is thought that
for longer transform lengths, more distinct levels of powers would appear [10].

Figure 3.2 A specially designed Power-of-Two Transform (N=8)


CHAPTER  4 


THE  HARTLEY  TRANSFORM 


4.1 The Continuous Time Hartley Transform


The Hartley Transform is a real integral transform named for Ralph V. Hartley,
who formulated it and published the results of his findings in the Proceedings of the
Institute of Radio Engineers [12] in 1942. Given a time function V(t), a transform pair
$V(t) \leftrightarrow \psi(\omega)$ is defined by Equations (4.1) and (4.2) as

$$\psi(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} V(t)\,\mathrm{cas}\,\omega t \; dt \tag{4.1}$$

$$V(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \psi(\omega)\,\mathrm{cas}\,\omega t \; d\omega \tag{4.2}$$

where

$$\mathrm{cas}\, t = \cos t + \sin t.$$


It is seen that the transform and its reciprocal have exactly the same form, and since
V(t) is real, the transform is also real. Furthermore, if we write the Fourier
Transform as

$$S(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} V(t)\,\exp(-i\omega t)\; dt$$

and its inverse as

$$V(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} S(\omega)\,\exp(i\omega t)\; d\omega,$$

we can convert the Hartley Transform to the Fourier by a simple algebraic manipulation
[4]. Let $\psi(\omega) = e(\omega) + o(\omega)$, where $e(\omega)$ and $o(\omega)$ are the even and odd parts of
$\psi(\omega)$, respectively. Then

$$e(\omega) = \frac{\psi(\omega) + \psi(-\omega)}{2} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} V(t)\,\cos \omega t \; dt$$

and

$$o(\omega) = \frac{\psi(\omega) - \psi(-\omega)}{2} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} V(t)\,\sin \omega t \; dt.$$

So given $\psi(\omega)$,

$$S(\omega) = e(\omega) - i\,o(\omega).$$

Conversely, given $S(\omega)$,

$$\psi(\omega) = \mathrm{Re}[S(\omega)] - \mathrm{Im}[S(\omega)].$$

This shows that the Hartley Transform allows the calculation of the Fourier Transform
without the use of complex arithmetic, resulting in savings both in computation and
storage in situations where the Fourier Transform is desired.

In the case of the Hartley Transform, there are a number of known properties that
can streamline calculations. These are expressed as theorems like those of the Fourier
Transform, and in most cases have been developed from a corresponding Fourier
theorem. They are described in detail in the literature [4].

4.2 The Discrete Hartley Transform


For  the  Hartley  Transform  to  be  of  use  in  the  problem  of  adaptive  filtering,  there 
must  be  a  discrete  Hartley  Transform,  and  an  FFT-like  fast  algorithm  accompanying  the 
continuous  case.  Fortunately  both  of  these  have  been  worked  out  in  some  detail  [3, 4, 5, 6]. 

If we substitute for time a discrete variable $\tau$, which can take on N integral
values from 0 to N-1, we can define the Discrete Hartley Transform (DHT) as

$$H(u) = N^{-1} \sum_{\tau=0}^{N-1} f(\tau)\,\mathrm{cas}(2\pi u\tau/N)$$

and its inverse as

$$f(\tau) = \sum_{u=0}^{N-1} H(u)\,\mathrm{cas}(2\pi u\tau/N)$$

where cas(x) was defined above. Note that in the discrete case, as with the
continuous Hartley Transform, the transform and its inverse have the same form
(apart from the scale factor $N^{-1}$). Similarly, we expect a simple algebraic
relationship between the DHT and the DFT. To get the DFT, we again split the DHT
into even and odd parts

$$H(u) = E(u) + O(u)$$

where

$$E(u) = \frac{H(u) + H(N-u)}{2}, \qquad O(u) = \frac{H(u) - H(N-u)}{2}.$$

Then the DFT is

$$F(u) = E(u) - i\,O(u).$$

Once again, conversely, the DHT is

$$H(u) = \mathrm{Re}[F(u)] - \mathrm{Im}[F(u)].$$

Again, in the discrete case, as in the continuous time case, there are theorems for the
DHT that the interested reader may wish to examine [3,4]. Figure 4.1 shows the form
of the DHT for N=8. Notice that the elements of the matrix are either ±1, 0, or ±√2. This
reduces the number of multiplications that must be done, much as in the P02 or WHT.
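
The relationship between the DHT and the DFT can be checked numerically. The sketch below evaluates the DHT directly from its definition (with the 1/N factor on the forward transform), forms the DFT from the even and odd parts, and compares the result with a directly computed DFT that uses the same 1/N scaling. The function names and the test sequence are illustrative assumptions.

```python
import cmath
import math

def cas(x):
    """cas(x) = cos(x) + sin(x)."""
    return math.cos(x) + math.sin(x)

def dht(f):
    """Direct DHT: H(u) = (1/N) * sum_t f(t) cas(2*pi*u*t/N)."""
    n = len(f)
    return [sum(f[t] * cas(2.0 * math.pi * u * t / n) for t in range(n)) / n
            for u in range(n)]

def inverse_dht(h):
    """Inverse DHT: f(t) = sum_u H(u) cas(2*pi*u*t/N)."""
    n = len(h)
    return [sum(h[u] * cas(2.0 * math.pi * u * t / n) for u in range(n))
            for t in range(n)]

def dft_from_dht(h):
    """F(u) = E(u) - i*O(u), with E and O the even and odd parts of H."""
    n = len(h)
    result = []
    for u in range(n):
        mirror = h[(n - u) % n]          # H(N-u), with H(N) taken as H(0)
        even = (h[u] + mirror) / 2.0
        odd = (h[u] - mirror) / 2.0
        result.append(complex(even, -odd))
    return result

def dft_direct(f):
    """Reference DFT with the same 1/N scaling, using complex arithmetic."""
    n = len(f)
    return [sum(f[t] * cmath.exp(-2j * math.pi * u * t / n) for t in range(n)) / n
            for u in range(n)]

signal = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 4.0]
hartley = dht(signal)
fourier = dft_from_dht(hartley)
reference = dft_direct(signal)
recovered = inverse_dht(hartley)
```

Only the final step of pairing H(u) with H(N-u) introduces complex numbers; everything before it is real arithmetic.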

4.3 The Fast Hartley Transform


It was discussed earlier that for the DHT to be really useful, it must have a fast
algorithm. Since it resembles the DFT in so many ways already, one could surmise that
it would have one. A Fast Hartley Transform has been developed that, given a data
sequence of length N, where $N = 2^P$, will reduce the number of arithmetic operations to
the order of $N \log_2 N$ [6].

The transform takes on a butterfly form, resembling the forms of the FFT, and is
shown in Figure 4.2. What follows is a brief description of each step of
the process. The purpose of PERMUTE is to arrange the input vector in data
pairs. So an eight-element input vector would be arranged as four pairs, etc.
The output $f_0(\tau)$ is the result of the first "butterfly", which furnishes the two
elements to be combined for the result called $f_1(\tau)$. This process continues a
total of P-1 times. The COMBINE operation performs a function like the bit
reversal required for the FFT. The result $f_{P-1}(u)$ will be equal to N·H(u).
Dividing each value of $f_{P-1}(u)$ by N will give the correct value of H(u). We may
also calculate the DFT at this point, if so desired, via the method described
earlier.

Figure 4.2 Fast Hartley Transform flow diagram
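
One way to see how the operation count drops to the order of N log2 N is a recursive radix-2 decomposition. The sketch below is a decimation-in-time form written for clarity; it is not claimed to follow the PERMUTE, butterfly, and COMBINE stages of Figure 4.2 step by step. It returns the unnormalized sum N·H(u), and the identity cas(a+b) = cos(b) cas(a) + sin(b) cas(-a) supplies the combining rule.

```python
import math

def fht(x):
    """Unnormalized fast Hartley transform: sum_n x(n) cas(2*pi*n*k/N).

    The length must be a power of two. Divide the result by N to obtain
    H(u) under the 1/N convention used for the DHT.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [x[0]]
    even = fht(x[0::2])      # transform of even-indexed samples
    odd = fht(x[1::2])       # transform of odd-indexed samples
    half = n // 2
    out = [0.0] * n
    for k in range(half):
        angle = 2.0 * math.pi * k / n
        # The odd half needs both odd[k] and its mirror odd[-k mod N/2].
        rotated = (math.cos(angle) * odd[k] +
                   math.sin(angle) * odd[(half - k) % half])
        out[k] = even[k] + rotated
        out[k + half] = even[k] - rotated
    return out

def dht_direct(x):
    """Reference: unnormalized DHT evaluated straight from the definition."""
    n = len(x)
    return [sum(x[m] * (math.cos(2.0 * math.pi * m * k / n) +
                        math.sin(2.0 * math.pi * m * k / n)) for m in range(n))
            for k in range(n)]

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
fast = fht(data)
slow = dht_direct(data)
hartley = [value / len(data) for value in fast]   # H(u) with the 1/N factor
```

Unlike the FFT butterfly, each output here draws on two entries of the odd half, odd[k] and its mirror image, which is the main structural difference between the two fast algorithms.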


CHAPTER  5 


COMPUTER  EXPERIMENTS 


5.1  Computer  Methods 

All experiments were run on the Cyber-175 in FORTRAN. It was necessary to
generate a model system, input vector, coloring noise filter, and the adaptive LMS
algorithm for each transform. An eight-tap FIR low-pass filter system was
created with DFDZR1 [14], a FORTRAN version of the Parks-McClellan equiripple
filter design program. The frequency response of the resulting filter, shown in
Figures 5.1a and 5.1b, was then used as the model system for the algorithm.
White noise was generated by RANF, an intrinsic function on CYBER. Because RANF
results in random samples between 0.0 and 1.0, 0.5 was subtracted from each
value of RANF in order to have a random variable with zero mean to be used as
the input vector. Four coloring noise filters, each also an eight-tap FIR filter,
were generated to test the transforms' robustness. The program to actually
perform the simulations was ADAPOP4 [13]. Other software was created to plot
the resulting data.
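
The input generation described above can be sketched as follows. Python's `random.random()` stands in for RANF as a uniform generator on [0, 1), and the two-tap difference filter is a hypothetical high-pass coloring filter used only for illustration; it is not one of the Parks-McClellan designs used in these experiments.

```python
import random

def zero_mean_noise(count, seed=0):
    """Uniform samples on [0, 1) shifted by 0.5 to give a zero-mean input."""
    rng = random.Random(seed)
    return [rng.random() - 0.5 for _ in range(count)]

def fir_filter(taps, signal):
    """Causal FIR filtering with zero initial conditions."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out

white = zero_mean_noise(20000)
mean = sum(white) / len(white)
variance = sum((v - mean) ** 2 for v in white) / len(white)

# Hypothetical coloring filter: a first difference emphasizes high frequencies.
colored = fir_filter([1.0, -1.0], white)
```

A uniform variable on [-0.5, 0.5) has variance 1/12, so the sample variance of `white` should be close to 0.083.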

5.2  Experiments  and  Results 


It was hoped that by simulating the LMS algorithm using a different
orthogonal transform for each run we could measure several things: 1) the
convergence of each transform versus an untransformed input, and 2) the best
overall transform. The first was necessary to illustrate that in no case would
the results of transforming the input be worse than no transform at all. The
second has to do with the robustness of the transforms. It has been shown that
for a given input, there is an optimal transform that will result in a complete
decorrelation of the Rxx matrix. For any such input this is the Karhunen-Loeve
Transform, which has been shown to be unsuitable for our purposes because it is
not a time-invariant transform. By requiring a time-invariant transform, one
may find behavior that is not consistent for all inputs. For example, it has been
shown that for a stationary sample of speech, the Discrete Cosine Transform will
most effectively decorrelate the signal [8]. One possible explanation for this is
that the eigenvectors of speech may line up in almost the same directions as the
rows of the DCT, while the same is not true for the other transforms tested.
This does not mean the DCT is necessarily the transform to use for signals with
statistics unlike speech.

If the input is white noise, it was previously demonstrated that any of the
transformed inputs will converge as fast as, but no faster than, the time domain
case [7]. This comes about because the already random vector gains nothing from
the orthogonalization process.

To compare the performance of the transforms and the time domain, the
adaptive filters of Figures 5.2 and 5.3 were simulated, using the low-pass system
model described above as the desired signal, and each of the four colored input
spectra shown in Figures 5.4-5.7 to shape the input spectrum. The adaptive
system is configured in the system identification mode; thus, the final output of
the system will be a set of adapted coefficients, $A(\infty)$, representing the
estimate of the model spectra, with the MMSE resulting from the adaptation
representing the final misadjustment of those coefficients. In most cases the
filter was allowed to adapt for a sufficient number of iterations to reach
convergence. Convergence here is defined as that point where the MSE has reached
-300 dB, the noise floor of the CYBER. In some instances, most notably several
time-domain runs, the process has not yet converged at the end of 20,000
iterations. These processes were terminated before they became prohibitively
expensive. In all instances, however, enough adaptation has occurred to establish
a trend that can be carried to convergence if necessary.

Figure 5.6 The frequency response of the band-pass coloring filter

The result that was used to judge the effectiveness of the transforms was
the adaptive learning curve, a measurement of the MSE versus the number of
iterations. Another factor that may prove useful in future studies is the actual
set of filter coefficients generated by the transforms, which will show how well
the adaptation worked in areas of different power densities.
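
A learning curve of this kind can be produced from the squared-error history of any of the adaptive filters. The sketch below converts squared error to decibels, optionally averages it over blocks to reduce the sample-to-sample variation, and reports the first iteration at which a threshold is reached. The floor value of 1e-30 (corresponding to -300 dB), the block-averaging scheme, and the function names are assumptions for illustration and do not describe the plotting software used in this work.

```python
import math

def learning_curve_db(sq_err, floor=1e-30):
    """Squared error in dB, clipped at an assumed floor (1e-30 is -300 dB)."""
    return [10.0 * math.log10(max(e, floor)) for e in sq_err]

def block_average(curve, width):
    """Average over non-overlapping blocks to smooth a noisy curve."""
    return [sum(curve[i:i + width]) / width
            for i in range(0, len(curve) - width + 1, width)]

def iterations_to_reach(curve_db, threshold_db):
    """First iteration whose error is at or below the threshold, else None."""
    for n, value in enumerate(curve_db):
        if value <= threshold_db:
            return n
    return None

# Hypothetical squared-error history, for illustration only.
example = [1.0, 1e-10, 1e-20, 0.0]
curve = learning_curve_db(example)
smoothed = block_average(curve, 2)
reached = iterations_to_reach(curve, -150.0)
```

Returning None when the threshold is never reached mirrors the time-domain runs that had not converged when the simulation was stopped.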

5.3  Comparison  with  the  Time  Domain 

The first colored input noise case is zero-mean white noise from RANF
passed through a 32-tap linear FIR high-pass filter, whose frequency response is
shown in Figure 5.4, then fed to the time and transformed adaptive systems.

The system was first adapted using the time-domain LMS algorithm. The
learning curve for this process is shown in Figure 5.8. This curve exhibits quite
a bit of variation because it was not smoothed, but the downward trend is
obvious. One can immediately see that this system has not converged to the noise
floor even after 20,000 iterations.

To compare each of the transforms to the time domain, the transform
adaptive filter was trained with the high-passed noise. The same algorithm was
repeated five times, once with complex arithmetic for the DFT, and four times
using real arithmetic for the DCT, DHT, WHT, and P02. The only difference
between runs was the substitution of each transform in the signal path. The
learning curves for each transform are shown individually, superimposed over the
time domain in Figures 5.9-5.13. Clearly, these curves for the transformed
inputs descend more steeply, which translates to fewer iterations before
convergence. Thus, all the transformed adaptive filters can be said to outperform
the time domain filter for this type of colored noise input.

The process of comparing the time domain to the transformed domains was
repeated using three more coloring filters, low-pass, band-pass, and band-reject,
each a 32-tap linear FIR filter created with DFDZR1. The frequency responses of
each are shown in Figures 5.5-5.7. A time domain adaptation learning curve was
generated for each colored input, shown in Figures 5.14-5.16. The low-pass and
band-pass cases are very similar to the high-pass: neither one has converged after
20,000 iterations. The band-reject case is strikingly different. Not only has the
time-domain filter converged, it has converged in what appears to be about 7500
iterations. This phenomenon will be commented on later.

Figure 5.9 The DFT versus time for high-pass noise

Figure 5.12 The WHT versus time for high-pass noise

Figure 5.13 The P02 versus time for high-pass noise

Figure 5.15 The learning curve for the time-domain filter trained
with band-pass noise

Figure 5.16 The learning curve for the time-domain filter trained
with band-reject noise


Each transform was again compared to the time domain, with the same
result as the first case. In each of these three cases, the transformed inputs
required fewer iterations to reach convergence. Thus, all the transformed
adaptive filters can be said to outperform the time domain filter for these types
of colored noise input. The series of figures showing the resulting learning
curves of each transform over the corresponding time domain learning curve
begins with Figures 5.17-5.21 for the low-pass case, and includes Figures
5.22-5.26 for the band-pass case, and Figures 5.27-5.31 for the band-reject case.

The transformed filters perform better than the time domain for each of these
four well-known filter types, which in these experiments represent the shapes of
input spectra. They have also been shown to perform as well as the time domain
with white noise, which can be interpreted as a fifth common filter type, the
all-pass network. We therefore conclude that, for any practical system, the
transform domain adaptive filter will be superior. As a point of theory, it is
possible to construct a transform that, given a certain input, will display a
learning curve inferior to that of the time domain. The experiments here show
that none of the transforms examined behaves this way for the inputs tested.
Since the experiment appears to account for most, if not all, of the likely
combinations of transforms and inputs, this theoretical point is of little
practical significance.
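
One way to probe this theoretical point numerically is through the eigenvalue spread of the input autocorrelation matrix, the quantity usually taken to govern LMS convergence speed. The sketch below compares the spread before and after a transform followed by per-bin power normalization. It rests on assumptions not found in this section: a two-tap coloring filter, an exact (not estimated) autocorrelation, and the hypothetical helper names `autocorr_matrix` and `spread_after_transform`.

```python
import numpy as np

def autocorr_matrix(coloring, n):
    """Exact n x n autocorrelation matrix of unit-variance white noise
    passed through the FIR filter `coloring`."""
    c = np.asarray(coloring, dtype=float)
    full = np.correlate(c, c, mode="full")     # lags -(L-1) .. (L-1)
    r = np.zeros(n)
    center = len(c) - 1                        # index of lag 0
    m = min(n, len(c))
    r[:m] = full[center:center + m]
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return r[lags]                             # symmetric Toeplitz matrix

def spread_after_transform(R, T):
    """Eigenvalue spread of T R T^T after scaling every bin to unit power."""
    Rt = T @ R @ T.T
    s = 1.0 / np.sqrt(np.diag(Rt))
    Rn = Rt * np.outer(s, s)
    eig = np.linalg.eigvalsh(Rn)               # ascending order
    return eig[-1] / eig[0]

n = 8
R = autocorr_matrix([1.0, -0.9], n)            # assumed high-pass coloring
k = np.arange(n)
arg = 2.0 * np.pi * np.outer(k, k) / n
dht = (np.cos(arg) + np.sin(arg)) / np.sqrt(n) # orthonormal DHT matrix

spread_time = spread_after_transform(R, np.eye(n))   # identity = time domain
spread_dht = spread_after_transform(R, dht)
print(f"time domain: {spread_time:.2f}, DHT: {spread_dht:.2f}")
```

A transform that leaves the normalized spread larger than the time-domain spread for some input would be a candidate for the inferior case described above. The sketch makes no claim about which way the comparison falls for any particular filter.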
Figure 5.17 The DFT versus time for low-pass noise
Figure 5.18 The DCT versus time for low-pass noise
Figure 5.20 The WHT versus time for low-pass noise
Figure 5.22 The DFT versus time for band-pass noise
Figure 5.23 The DCT versus time for band-pass noise
Figure 5.25 The WHT versus time for band-pass noise
Figure 5.27 The DFT versus time for band-reject noise
Figure 5.30 The WHT versus time for band-reject noise
Figure 5.31 The P02 versus time for band-reject noise

Figure 5.32 A comparison of transform performance for high-pass noise (horizontal axis: number of iterations)
5.4 A Comparison of Transforms


Once it has been established that the transform domain will result in faster
convergence, it is of interest to determine which transform gives the greatest
advantage over the time domain for each given input, and which transform
provides the greatest advantage over the range of inputs taken as a whole.
Another set of figures has been prepared that shows the learning curves for each
transform superimposed on one axis for each colored noise input. Figure 5.32
shows the performance of the five transforms with high-pass noise. Because the
original curves were noisy, smoothing was performed by hand. The smoothing was
necessary to allow unobstructed viewing of each curve, and the smoothed curves
are intended only to allow comparison of the transforms' relative performance.
The actual learning curves are available in the earlier figures, where each is
compared against the untransformed adaptive filter. For the high-pass noise, all
the transforms except the DFT allowed convergence in under 20,000 iterations.
The fastest convergence was achieved by the DHT and the P02, with the P02
seemingly finishing just slightly ahead.
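
The hand smoothing mentioned above could be approximated in software with a moving average. This is a swapped-in technique offered only as a sketch, not the procedure used to prepare Figure 5.32. The function name `smooth_curve` and the window length are illustrative assumptions.

```python
import numpy as np

def smooth_curve(sq_err, window=101):
    """Moving average of a squared-error learning curve.

    Returns len(sq_err) - window + 1 points, so no edge padding is invented.
    """
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(sq_err, dtype=float), kernel, mode="valid")

# Usage with a synthetic, noisy, decaying curve (made-up data for illustration).
rng = np.random.default_rng(0)
raw = np.exp(-np.arange(2000) / 300.0) * rng.chisquare(1, size=2000)
smoothed = smooth_curve(raw, window=101)
```

A longer window gives a cleaner curve at the cost of blurring the initial steep descent, so the window should stay short compared with the convergence time being compared.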

A very different result is evident with the low-pass model. As
the set of curves in Figure 5.33 shows, only two of the transforms have allowed
convergence within 20,000 iterations, and one, the DCT, has converged almost as
fast as if the input were white noise. The only other transform to converge, the
DHT, shows a precipitous drop in the first 100 or so adaptations, resulting in a
curve much like that of the DCT for that span, but then the slope of the curve
changes, and the DHT takes about 12,500 iterations in all to reach the noise
floor of the CYBER.

For  the  band-pass  filter,  all  the  transforms  allow  convergence  in  less  than 
20,000  iterations.  On  this  run,  the  first  transform  to  converge  is  the  DFT, 
followed  by  the  DHT.  See  Figure  5.34. 

While the other three colorations resulted in system convergence time on the
order of 20,000 iterations, the final filter, the band-reject case, has converged
in the time domain in less than 10,000 iterations. When the transform is added,
convergence occurs in less than 2000 iterations as a worst case. The fastest
transform in this case appears to be the DHT, in about 1100 iterations, followed
by the WHT. See Figure 5.35.

For any given input, one transform will be optimal. In these experiments, each
of the four inputs had a different fastest transform. To derive a
ranking, each transform was given a number corresponding to where it finished
relative to the other transforms tested for each input: the number 1 was given to
the transform that converged first, 2 to the next, and so on. Table 5.1 shows the
numbers given to all the transforms for the four inputs. In some cases, ties have
been declared when the learning curves are very close. From the table, one can
see that the most consistent transform by far is the DHT. It is the best in one
case, and never falls lower than second place on any input. While in certain
cases other transforms may converge much more quickly, the consistency of the
Hartley Transform makes it more versatile, allowing its use in many applications,
including the one outlined in the introduction.
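
The ranking procedure described above can be sketched as a small function. The tie rule used here (entries within a tolerance of the leader of their group share a place, and the next place skips ahead) is an assumption, since the text does not say how ties were numbered. The function name `rank_with_ties`, the labels T1 to T5, and the iteration counts are invented for illustration and are not the entries of Table 5.1.

```python
import math

def rank_with_ties(iterations, tolerance=0.0):
    """Map {label: iterations to converge} to {label: finishing place}.

    None means "did not converge" and ranks last. Labels whose counts lie
    within `tolerance` of the current group leader share that leader's place.
    """
    ordered = sorted(iterations.items(),
                     key=lambda kv: math.inf if kv[1] is None else kv[1])
    places = {}
    leader_value, leader_place = None, 0
    for position, (label, value) in enumerate(ordered, start=1):
        value = math.inf if value is None else value
        tied = leader_value is not None and (
            value == leader_value or value - leader_value <= tolerance)
        if tied:
            places[label] = leader_place
        else:
            leader_value, leader_place = value, position
            places[label] = position
    return places

# Invented counts for five hypothetical transforms on one input.
example = {"T1": 8800, "T2": 9000, "T3": 15000, "T4": 16000, "T5": None}
places = rank_with_ties(example, tolerance=500)
print(places)
```

Repeating this for each input and comparing a transform's worst place across inputs gives one simple measure of consistency in the sense used above.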


Computational complexity refers to the amount of calculation needed to
implement an algorithm. Part of the attraction of the original LMS algorithm
was the ease of computation brought about by the gradient estimate. One of the
purposes of this research was to study ways to increase the convergence rate of
the LMS algorithm while keeping the level of complexity to that which could be
accommodated by established VLSI architecture. The architecture presently in use
may lag behind the state of the art, or may for some other reason be unable to
accommodate a more complicated algorithm. Thus, the best algorithm can be
rendered useless in some instances if the issue of its computation is ignored.

Any good programmer can reduce the amount of computation necessary in any
process to some extent, but a baseline will eventually be reached. In the case of
the DFT, it is known that calculating the FFT of an N-point sequence requires on
the order of N log2 N complex multiplies, and a like number of additions [1]. A
complex multiply, computed directly, requires four real multiplies. Thus, the
first factor to be examined is complex versus real arithmetic. There is no
denying that the Fourier is a powerful transform. However, it has been shown
that the Hartley Transform allows the calculation of the Fourier with only real
arithmetic. Furthermore, since the transform rankings show the Hartley to be a
more consistent performer, it becomes even more attractive. It has been shown
[15, 16] that the DFT can be calculated by using an in-place Fast Hartley
Transform algorithm about twice as fast as a normal complex FFT. There are some
more sophisticated FFT routines that can approach that speed, but the important
thing is that we do not necessarily want the DFT. In most cases, our purposes
are better served by the Hartley transform, which is real, and which can be
computed about twice as fast as the FFT. The same savings are available to the
other real transforms. The Walsh-Hadamard Transform can be calculated easily by
using the FFT structure, making everything real, and setting all the exponential
multipliers equal to one [20]. The P02 Transform can also be calculated with no
multiplications. Since all of its values are powers of two, the computer need
only use shifts and adds to calculate the outputs [10]. Again, this will result
in substantial computational savings over the FFT, but the performance of the
WHT and P02, as well as that of the DCT, is not as consistent as that of the DHT.
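
Two of the relationships in the paragraph above can be checked numerically: the DFT of a real sequence follows from its DHT by sums and differences alone, and the Walsh-Hadamard transform has an FFT-like butterfly structure that uses only additions and subtractions. The sketch below is illustrative and is not one of the algorithms of [15, 16, 20]. For brevity it obtains the DHT from NumPy's FFT, whereas a fast Hartley transform would compute it directly in real arithmetic. The function names are assumptions.

```python
import numpy as np

def dht(x):
    """Unnormalized DHT of a real sequence, H[k] = sum x[n] cas(2*pi*n*k/N).

    Computed here as Re X[k] - Im X[k], where X is the DFT of x.
    """
    X = np.fft.fft(np.asarray(x, dtype=float))
    return X.real - X.imag

def dft_from_dht(H):
    """Recover the DFT from the DHT using only sums and differences."""
    H = np.asarray(H, dtype=float)
    H_rev = np.roll(H[::-1], 1)                # H[(N - k) mod N]
    return 0.5 * (H + H_rev) - 0.5j * (H - H_rev)

def fwht(x):
    """Fast Walsh-Hadamard transform in natural (Hadamard) order.

    Same butterfly layout as a radix-2 FFT with every multiplier set to one.
    The length must be a power of two.
    """
    a = np.array(x, dtype=float)
    n = len(a)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            top = a[start:start + h].copy()
            bottom = a[start + h:start + 2 * h].copy()
            a[start:start + h] = top + bottom
            a[start + h:start + 2 * h] = top - bottom
        h *= 2
    return a

x = np.arange(8, dtype=float)
print(np.allclose(dft_from_dht(dht(x)), np.fft.fft(x)))   # DFT recovered from DHT
print(np.allclose(dht(dht(x)) / len(x), x))               # DHT is its own inverse up to 1/N
print(np.allclose(fwht(fwht(x)) / len(x), x))             # so is the WHT
```

Because the DHT output is real and the transform is its own inverse up to a scale factor, one real routine serves for both directions, which is part of its appeal where only real signals are involved.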


CHAPTER  6 


CONCLUSIONS 


6.1 Summary

A  more  complete  investigation  of  various  transforms  used  to  improve  the 
performance  of  the  adaptive  LMS  algorithm  was  undertaken  in  order  to  provide  a 
hierarchy of those transforms. The transforms studied, the DFT, DCT, DHT, WHT,
and P02, were all compared to the time-domain algorithm for four different
common  types  of  signalling  conditions,  to  show  that  all  did  indeed  provide  some 
advantage  with  each  of  those  types  of  inputs.  The  transforms  were  then  compared 
to  one  another  in  order  to  establish  which  transform  provided  the  best  advantage 
with  each  input,  and  which  one  provided  the  best  overall  performance.  In  the  last 
case,  the  DHT  was  shown  to  be  the  most  consistent  performer,  though  at  a  higher 
computational  cost  than  either  the  WHT  or  P02. 

6.2  Future  Directions 

No transform performed well enough under all conditions to be declared the
best transform to use with the LMS algorithm, though the Hartley performed well
enough to recommend its use if transform-domain adaptive digital filtering is
required. The difficulty in establishing a clear favorite experimentally, coupled
with the failure to derive analytically a measure that would show conclusively
which transform is the best overall, perhaps signals that this particular
direction has gone as far as it will. Already attention has turned to other
methods of improving the performance of adaptive filters. The transform LMS
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